VLSlice: Interactive Vision-and-Language Slice Discovery
Recent work in vision-and-language demonstrates that large-scale pretraining
can learn generalizable models that are efficiently transferable to downstream
tasks. While this may improve dataset-scale aggregate metrics, analyzing
performance around hand-crafted subgroups targeting specific bias dimensions
reveals systemic undesirable behaviors. However, this subgroup analysis is
frequently stalled by annotation efforts, which require extensive time and
resources to collect the necessary data. Prior work attempts to automatically
discover subgroups to circumvent these constraints, but it typically leverages
model behavior on existing task-specific annotations, degrades rapidly on inputs
more complex than tabular data, and does not study vision-and-language models.
vision-and-language models. This paper presents VLSlice, an interactive system
enabling user-guided discovery of coherent representation-level subgroups with
consistent visiolinguistic behavior, denoted as vision-and-language slices,
from unlabeled image sets. In a user study (n=22), we show that VLSlice enables
users to quickly generate diverse, high-coherency slices, and we release the
tool publicly.
Comment: Conference paper at ICCV 2023. 17 pages, 11 figures.
https://ericslyman.com/vlslice
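To make "representation-level subgroups" concrete, here is a minimal illustrative sketch of one plausible approach: rank unlabeled images by similarity to a text query in a joint vision-language embedding space, then cluster the image embeddings into candidate slices. This is an assumption-laden toy (tiny 2-D embeddings, plain k-means), not the actual VLSlice algorithm; the function name `discover_slices` is hypothetical.

```python
import numpy as np

def discover_slices(image_embs, text_emb, k=2, iters=20, seed=0):
    """Toy slice discovery: score images against a text query, then
    cluster their normalized embeddings into k candidate slices
    with spherical k-means. Illustrative only."""
    # Cosine similarity between each image embedding and the query.
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb)
    sims = img @ txt
    # Simple k-means over the unit-normalized image embeddings.
    rng = np.random.default_rng(seed)
    centers = img[rng.choice(len(img), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(img @ centers.T, axis=1)
        for j in range(k):
            members = img[labels == j]
            if len(members):
                c = members.mean(axis=0)
                centers[j] = c / np.linalg.norm(c)
    return labels, sims

# Two well-separated toy "image" embedding groups plus a query vector.
embs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
query = np.array([1.0, 0.2])
labels, sims = discover_slices(embs, query, k=2)
```

In an interactive tool, a user would inspect and refine such clusters; the point of the sketch is only that coherent slices can emerge from unlabeled embeddings without task-specific annotation.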
CNN 101: Interactive Visual Learning for Convolutional Neural Networks
The success of deep learning in solving problems previously thought intractable
has inspired many non-experts to learn about and understand this exciting technology.
However, it is often challenging for learners to take the first steps due to
the complexity of deep learning models. We present our ongoing work, CNN 101,
an interactive visualization system for explaining and teaching convolutional
neural networks. Through tightly integrated interactive views, CNN 101 offers
both overview and detailed descriptions of how a model works. Built using
modern web technologies, CNN 101 runs locally in users' web browsers without
requiring specialized hardware, broadening public access to education about
modern deep learning techniques.
Comment: CHI'20 Late-Breaking Work (April 25-30, 2020), 7 pages, 3 figures.
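The core operation that a CNN explainer like this visualizes is the convolution that turns an input image into a feature map. A minimal sketch, assuming a single-channel image and valid-mode sliding windows (the example image and kernel are invented for illustration):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation, as CNNs use it):
    slide the kernel over the image and sum elementwise products."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# A vertical-edge detector applied to a tiny image with a sharp edge.
image = np.array([[0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.]])
kernel = np.array([[-1., 1.],
                   [-1., 1.]])
fmap = conv2d(image, kernel)  # responds strongly where the edge sits
```

The resulting feature map is large exactly where the dark-to-bright edge falls under the kernel, which is the kind of intermediate activation an interactive visualization renders step by step.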
Human-centered AI through scalable visual data analytics
While artificial intelligence (AI) has led to major breakthroughs in many domains, understanding machine learning models remains a fundamental challenge. How can we make AI more accessible and interpretable, or more broadly, human-centered, so that people can easily understand and effectively use these complex models? My dissertation addresses these fundamental and practical challenges in AI through a human-centered approach, by creating novel data visualization tools that are scalable, interactive, and easy to learn and to use. With such tools, users can better understand models by visually exploring how large input datasets affect the models and their results. Specifically, my dissertation focuses on three interrelated parts:
(1) Unified scalable interpretation: developing scalable visual analytics tools that help engineers interpret industry-scale deep learning models at both instance- and subset-level (e.g., ActiVis deployed by Facebook);
(2) Data-driven model auditing: designing visual data exploration tools that support discovery of insights through exploration of data groups across different analytics stages, such as model comparison (e.g., MLCube) and fairness auditing (e.g., FairVis); and
(3) Learning complex models by experimentation: building interactive tools that broaden people's access to learning complex deep learning models (e.g., GAN Lab) and browsing raw datasets (e.g., ETable).
My research has made significant impact on society and industry. The ActiVis system for interpreting deep learning models has been deployed on Facebook's machine learning platform. The GAN Lab tool for learning GANs has been open-sourced in collaboration with Google, and its demo has been used by more than 70,000 people from over 160 countries.
Ph.D. dissertation
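The fairness-auditing idea in part (2) can be illustrated with a minimal sketch: compute a model's accuracy per data subgroup and surface the disparity between groups. This is in the spirit of tools like FairVis, not their actual implementation; the records and group names below are hypothetical.

```python
def subgroup_accuracy(records):
    """Given (group, label, prediction) records, compute per-group accuracy."""
    totals, correct = {}, {}
    for group, label, pred in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (label == pred)
    return {g: correct[g] / totals[g] for g in totals}

# Hypothetical audit data: the model does worse on group "B".
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),  # group A: 3/4 correct
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 1, 0),  # group B: 2/4 correct
]
acc = subgroup_accuracy(records)
gap = max(acc.values()) - min(acc.values())  # disparity surfaced to the auditor
```

A visual auditing tool extends exactly this computation across many candidate subgroups at once, letting the analyst spot which groups a model underserves.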